منابع مشابه
Variable selection using random forests
This paper proposes, focusing on random forests, the increasingly used statistical method for classification and regression problems introduced by Leo Breiman in 2001, to investigate two classical issues of variable selection. The first one is to find important variables for interpretation and the second one is more restrictive and try to design a good prediction model. The main contribution is...
متن کاملVariable Selection Using Random Forests
One of the main topic in the development of predictive models is the identification of variables which are predictors of a given outcome. Automated model selection methods, such as backward or forward stepwise regression, are classical solutions to this problem, but are generally based on strong assumptions about the functional form of the model or the distribution of residuals. In this paper a...
متن کاملVSURF: An R Package for Variable Selection Using Random Forests
This paper describes the R package VSURF. Based on random forests, and for both regression and classification problems, it returns two subsets of variables. The first is a subset of important variables including some redundancy which can be relevant for interpretation, and the second one is a smaller subset corresponding to a model trying to avoid redundancy focusing more closely on the predict...
متن کاملVariable Selection in Time Series Forecasting Using Random Forests
Time series forecasting using machine learning algorithms has gained popularity recently. Random forest is a machine learning algorithm implemented in time series forecasting; however, most of its forecasting properties have remained unexplored. Here we focus on assessing the performance of random forests in one-step forecasting using two large datasets of short time series with the aim to sugg...
متن کاملVariable selection with Random Forests for missing data
Variable selection has been suggested for Random Forests to improve their efficiency of data prediction and interpretation. However, its basic element, i.e. variable importance measures, can not be computed straightforward when there is missing data. Therefore an extensive simulation study has been conducted to explore possible solutions, i.e. multiple imputation, complete case analysis and a n...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Pattern Recognition Letters
سال: 2010
ISSN: 0167-8655
DOI: 10.1016/j.patrec.2010.03.014